By mounting inexpensive portable imaging devices on each train, we can collect real-time image data on every track being traveled and process the images using the Spark Image Layer.
The first question is how the data can be processed. The heavy lifting is done by a simple workflow built on top of our Spark Image Layer, which abstracts away the complexities of cloud computing and distributed analysis, leaving you to focus only on the core task of image processing.
Beyond a single train, our system scales linearly across multiple trains and machines to keep the computation real-time.
With cloud integration and Big Data frameworks, even an entire train network with hundreds of trains running continuously can be handled without worrying about networking, topology, or fault tolerance.
The images flying past the train at tens of meters per second are rich in information about the tracks, surrounding structures, and even clues to potential upcoming dangers. The first basic task is the segmentation of the tracks, which provides information on their separation, surface smoothness, and direction.
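The binarization step of such a segmentation is commonly done with Otsu's method, which picks the gray level that best separates rail pixels from background. The sketch below is an illustrative plain-Scala re-implementation for intuition only; it is not the Spark Image Layer API, and all names in it are our own:

```scala
// Illustrative Otsu threshold on 8-bit grayscale pixel values:
// choose the level that maximizes between-class variance of the
// background/foreground split (the idea behind an "OTSU" threshold step).
object OtsuDemo {
  def otsuThreshold(pixels: Array[Int]): Int = {
    val hist = Array.fill(256)(0)
    pixels.foreach(p => hist(p) += 1)
    val total  = pixels.length.toDouble
    val sumAll = hist.zipWithIndex.map { case (c, v) => c.toLong * v }.sum.toDouble
    var sumB = 0.0; var wB = 0.0
    var best = 0;   var bestVar = -1.0
    for (t <- 0 until 256) {
      wB += hist(t)                      // background weight up to level t
      if (wB > 0 && wB < total) {
        sumB += t.toDouble * hist(t)
        val wF = total - wB              // foreground weight
        val mB = sumB / wB               // background mean
        val mF = (sumAll - sumB) / wF    // foreground mean
        val between = wB * wF * (mB - mF) * (mB - mF)
        if (between > bestVar) { bestVar = between; best = t }
      }
    }
    best
  }

  def main(args: Array[String]): Unit = {
    // Synthetic bimodal frame: dark ballast (~30) and bright rails (~200)
    val img = Array.fill(900)(30) ++ Array.fill(100)(200)
    println(otsuThreshold(img)) // lands between the two intensity modes
  }
}
```

On a strongly bimodal histogram like this one, any threshold between the two modes separates the classes equally well; the scan simply returns the first such level.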
Each time point from the video is shown here as a point and line corresponding to the left and right tracks.
Distributions of the optically estimated track surface smoothness can then be computed for each track over the course of the journey.
The distance between the two tracks can also be estimated over the entire journey.
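The separation estimate itself reduces to simple geometry once the two rails have been localized in each frame. The sketch below shows one plausible form of such a computation in plain Scala; `RailPair` and `separationStats` are hypothetical names of our own, not part of the Spark Image Layer:

```scala
// Hedged sketch: estimate track gauge (separation) and its spread
// from per-frame column positions of the detected left and right rails.
object SeparationDemo {
  // One detection per frame: pixel column of left and right rail.
  final case class RailPair(left: Double, right: Double)

  // Returns (mean separation, standard deviation) over a journey segment.
  def separationStats(pairs: Seq[RailPair]): (Double, Double) = {
    val seps = pairs.map(p => p.right - p.left)
    val mean = seps.sum / seps.size
    val variance = seps.map(s => (s - mean) * (s - mean)).sum / seps.size
    (mean, math.sqrt(variance))
  }

  def main(args: Array[String]): Unit = {
    val pairs = Seq(RailPair(100, 250), RailPair(102, 251), RailPair(98, 249))
    val (mean, sd) = separationStats(pairs)
    println(f"mean=$mean%.1f sd=$sd%.2f")
  }
}
```

A low standard deviation over a segment indicates a stable gauge; a drifting mean would be the kind of anomaly worth flagging.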
The information from each camera can be indexed by position instead of time and visualized using mapping APIs.
Once the Spark cluster has been created and you have the StreamingContext called ssc (automatically provided in Databricks Cloud or Zeppelin), the data can be loaded using the Spark Image Layer. Since we are performing real-time analysis, we acquire the images from a streaming source:
val wr = TrainCameraReceiver("sbb://train-3275")
val metaImageStream = ssc.receiverStream(wr)
Although we execute the command on one machine, the analysis will be distributed over the entire set of cluster resources available to ssc. To further process the images, we can take advantage of the rich set of functionality built into the Spark Image Layer:
def identifyTracks(time: Double, pos: GeoPos, inImage: Img[Byte]) = {
  // Denoise, enhance ridge-like (rail) structures, and binarize
  val rawTrack = inImage.
    run("Median...", "radius=3").
    run("Tubeness...").
    run("Threshold", "OTSU")
  // Keep only sizable connected components and measure their shape
  val trackShape = rawTrack.
    componentLabel().
    filter(_.area > 50).
    shapeAnalysis()
  TrackInformation(
    smoothness = rawTrack.smoothness(),
    separation = calcSep(trackShape)
  )
}
val trackStream = metaImageStream.map(identifyTracks)
ssc.start()
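Once the stream is running, the per-frame results would typically be reduced over windows before being reported. The windowed logic itself can be sketched without Spark; the code below is a pure-Scala stand-in (a real job would use DStream windowing), and `flagDrift` with its parameters is a hypothetical helper of our own:

```scala
// Sketch of downstream aggregation over the per-frame track measurements:
// group results into fixed-size windows and flag windows whose mean
// separation drifts from the nominal gauge by more than a tolerance.
object WindowAggDemo {
  final case class TrackInformation(smoothness: Double, separation: Double)

  def flagDrift(meas: Seq[TrackInformation], window: Int,
                nominal: Double, tol: Double): Seq[Boolean] =
    meas.grouped(window).map { w =>
      val meanSep = w.map(_.separation).sum / w.size
      math.abs(meanSep - nominal) > tol
    }.toSeq

  def main(args: Array[String]): Unit = {
    // First three frames on-gauge (150 px), last three drifted (158 px)
    val data = Seq.tabulate(6)(i =>
      TrackInformation(0.9, if (i < 3) 150.0 else 158.0))
    println(flagDrift(data, window = 3, nominal = 150.0, tol = 5.0))
  }
}
```

In the streaming setting the same reduction would run continuously, so a drifted window could raise an alert while the train is still on that segment.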
Analysis powered by the Spark Image Layer from 4Quant. Visualizations, document generation, and maps provided by:
H. Wickham (2009). ggplot2: Elegant Graphics for Data Analysis. Springer New York. ISBN 978-0-387-98140-6. http://had.co.nz/ggplot2/book
Joe Cheng and Yihui Xie (2014). leaflet: Create Interactive Web Maps with the JavaScript LeafLet Library. R package version 0.0.11. https://github.com/rstudio/leaflet
Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. http://www.jstatsoft.org/v40/i01/
Yihui Xie (2015). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.10.
Yihui Xie (2013). Dynamic Documents with R and knitr. Chapman and Hall/CRC. ISBN 978-1482203530.
Yihui Xie (2014). knitr: A Comprehensive Tool for Reproducible Research in R. In Victoria Stodden, Friedrich Leisch and Roger D. Peng (eds.), Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595.
JJ Allaire, Joe Cheng, Yihui Xie, Jonathan McPherson, Winston Chang, Jeff Allen, Hadley Wickham and Rob Hyndman (2015). rmarkdown: Dynamic Documents for R. R package version 0.7. http://CRAN.R-project.org/package=rmarkdown